Nucleic Acid Sample Preparation

ABSTRACT

This invention relates to the preparation of nucleic acid samples for analysis. The invention may be particularly useful for single stranded samples. Embodiments of the invention involve the attachment of double stranded or hairpin oligonucleotides using template independent polymerase enzymes in the preparation of nucleic acid sequencing libraries.

This invention relates to the preparation of nucleic acid samples for analysis.

Many methods exist for the preparation of samples of double-stranded DNA, for example for sequencing (e.g. Illumina TruSeq and Nextera, 454, NEBNext, Life Technologies etc).

However, the preparation of single-stranded DNA samples is more challenging because single stranded DNA molecules cannot be efficiently ligated together enzymatically. Reported workflows for the preparation of single-stranded DNA rely on the use of primers with degenerate sequences that “randomly prime” the single-stranded DNA and allow a truncated version of the parent DNA molecule to be adapted (for example, Epigenome™ Methyl-Seq kit, Epicentre Technologies WI USA). Methods using RNA ligase or CircLigase to join ends of single stranded DNA together have been reported but suffer from poor efficiency or are limited in the size of DNA fragments that can be ligated together.

Single stranded sample preparation is commonly used following bisulfite conversion of DNA molecules. The bisulfite conversion process necessarily results in the formation of single stranded DNA, and therefore involves either i) pre-bisulfite sample preparation or ii) post-bisulfite sample preparation employing random priming for downstream analysis. Drawbacks to these methods include the potential to generate nicked or fragmented libraries incapable of subsequent amplification, the loss of sequence information from the parent DNA molecules, and the generation of artefacts that contaminate the sample of interest or induce significant representation bias of reads in the final dataset. A direct method of ligating the termini of single stranded DNA post-bisulfite treatment in quantitative yield is of significant interest.

An aspect of the invention described herein provides a method of joining two oligonucleotides using a template independent nucleic acid polymerase enzyme such as terminal deoxynucleotidyl transferase (TdT). Terminal deoxynucleotidyl transferase (TdT), also known as DNA nucleotidylexotransferase (DNTT) or terminal transferase, is a specialized DNA polymerase which catalyses the addition of nucleotides to the 3′ terminus of a DNA molecule. Unlike most DNA polymerases, it does not require a template. TdT typically adds nucleotide 5′-triphosphates onto the 3′-hydroxyl of a single stranded first oligonucleotide sequence. The invention as described herein uses a second oligonucleotide carrying a 5′-triphosphate which can be attached to the first oligonucleotide sequence, thus enabling two single stranded oligonucleotides to be joined together, catalysed by TdT. Thus the enzyme can be used to link two oligonucleotide strands, rather than simply adding individual nucleotides.

An aspect of the invention described herein includes the attachment of adapters to single stranded nucleic acid samples. The adapters are oligonucleotides having a 5′-triphosphate and a 3′ single stranded ‘overhang’ region which hybridises to the ends of the nucleic acid sample. The adapters may be in the form of single stranded ‘hairpins’ having portions of self-complementary sequences, such that the 5′-triphosphate and 3′ single stranded ‘overhang’ region are within the same molecule, or may be two strands having at least partly complementary sequences, one strand having a 5′-triphosphate and the second strand having a 3′ single stranded ‘overhang’ region. The adapters may be used to copy the single stranded samples by extension of the 3′-overhang, which is hybridised to the nucleic acid strands and acts as a primer. The attachment of the adapter may be carried out using a template independent polymerase or a template dependent polymerase.

Polyadenylate polymerase (PAP) is an enzyme involved in the formation of the polyadenylate tail at the 3′ end of mRNA. PAP uses adenosine triphosphate (ATP) to add adenosine nucleotides to the 3′ end of an RNA strand. The enzyme works in a template independent manner. A further aspect of the invention involves the use of PAP to join two oligonucleotides. In the use of PAP, one or more of the oligonucleotides may be RNA rather than DNA. Similarly, poly(U) polymerase catalyzes the template independent addition of UMP from UTP or AMP from ATP to the 3′ end of RNA.

The term template independent nucleic acid polymerase enzyme includes any polymerase which acts without requiring a nucleic acid template. The term template independent nucleic acid polymerase enzyme includes terminal deoxynucleotidyl transferase (TdT), polyadenylate polymerase (PAP) and poly(U) polymerase (PUP). The template independent nucleic acid polymerase enzyme can be PAP. The template independent nucleic acid polymerase enzyme can be terminal transferase. The oligonucleotides joined can be RNA or DNA, or a combination of both RNA and DNA. The oligonucleotides can contain one or more modified backbone residues, modified sugar residues or modified nucleotide bases.

Template independent nucleic acid polymerase enzymes may be sensitive to the bulk of substituents attached to the ribose 3′-position. The standard substrates for these enzymes are nucleotide triphosphates in which the ribose 3′-position is a hydroxyl group. In order to increase the tolerance of the enzyme for sterically larger substituents at this position, the enzyme may be engineered using suitable amino acid substitutions to accommodate any increase in steric bulk. The term template independent nucleic acid polymerase enzyme therefore includes non-naturally occurring (engineered) enzymes. The term template independent nucleic acid polymerase enzyme includes modified versions of terminal transferase or PAP. Terminal transferase, PUP or PAP may be obtained from commercial sources (e.g. New England Biolabs).

The method described herein adds a single stranded oligonucleotide with a 3′ hydroxyl to a single stranded oligonucleotide with a 5′-triphosphate moiety, as shown in FIG. 1. The triphosphate moiety can be attached directly to the 5′-hydroxyl of the second oligonucleotide. In such cases the 5′-oligonucleotide triphosphate can react directly with the 3′-hydroxyl group of the first oligonucleotide to form a single stranded oligonucleotide containing the first and second sequences linked together via a standard ‘natural’ phosphodiester moiety. Such an oligonucleotide can be copied using a polymerase as there are no unnatural linking groups between the first and second oligonucleotides. The use of engineered template independent polymerase enzymes may increase the tolerance for steric bulk at the 3′-position of the triphosphate nucleotide, and hence allow the use of oligonucleotide strands attached directly to the 3′-hydroxyl of a nucleotide triphosphate.

Alternatively the triphosphate can be attached through a linker moiety. Linker moieties can be any functionality attached to the terminal 5′-hydroxyl of the oligonucleotide strand. The linker moiety can include one or more phosphate groups. The linker may contain a ribose or deoxyribose moiety. The linker may contain one or more further nucleotides. The nucleotides, or the ribose or deoxyribose moieties, may be further substituted. The linker may contain a ribose or deoxyribose moiety in which the oligonucleotide is attached to the 2-position of the ribose. The linker may contain a nucleotide in which the remainder of the oligonucleotide is attached via the nucleotide base. Suitable linkers are shown in FIG. 2. Where the generic description ‘linker’ is used, the linker may employ one or more carbon, oxygen, nitrogen or phosphorus atoms. The linker acts merely to attach the functional triphosphate moiety to the remainder of the oligonucleotide.

The joined oligonucleotides may be copied using a nucleic acid polymerase. The linker should permit a nucleotide polymerase to bridge through the linker in order to copy the strands after joining. The action of the polymerase may be enhanced by using a hybridised primer which can bridge across the linker region. The primer can be designed with a suitable length of sequence to space across the linker region. The sequence can be degenerate/random or simply be a suitable length of known sequence in order to bridge across any gap caused by the linker region.

The length of sequence used to bridge the gap can be designed depending on the choice of linker. The sequence can be used as a tag for individual fragments. The tag can be used to assess the level of bias introduced by any amplification reactions. If the tags are, say, 6-mers of random sequence, there are 4^6 (4096) variants of different sequence. From a population of fragments from a biological sample, it is highly unlikely that two fragments of the same ‘biological’ sequence will be joined to a tag with the same ‘tag’ sequence. Therefore any examples where the fragment and tag are over-represented in the sequencing reaction occur because the particular individual fragment is over-amplified during the PCR reaction when compared to other fragments in the population. Thus ‘tags’ of variable sequence can be used to help normalise the effects of amplification variability.
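By way of illustration only, the tag arithmetic and the normalisation idea described above might be sketched as follows. The function names and the representation of a read as a `(tag, fragment)` pair are assumptions made for this example and are not part of the disclosure.

```python
from collections import Counter

TAG_LENGTH = 6               # 6-mer random tag, as in the text
NUM_TAGS = 4 ** TAG_LENGTH   # 4^6 = 4096 possible tag sequences


def tag_collision_probability(n_molecules, n_tags=NUM_TAGS):
    """Chance that at least two of n_molecules copies of the same
    biological fragment receive the same random tag (a standard
    birthday-problem calculation, assuming tags are equally likely)."""
    p_all_distinct = 1.0
    for i in range(n_molecules):
        p_all_distinct *= (n_tags - i) / n_tags
    return 1.0 - p_all_distinct


def molecules_per_fragment(reads):
    """Collapse PCR duplicates: reads sharing both tag and fragment
    are counted once, so each count approximates the number of
    original molecules rather than the number of amplified copies."""
    counts = Counter()
    for _tag, fragment in set(reads):
        counts[fragment] += 1
    return counts


# Hypothetical reads: fragment "frag1" with tag AAACCC was over-amplified.
reads = [("AAACCC", "frag1")] * 5 + [("GGTTAC", "frag1"), ("AAACCC", "frag2")]
print(molecules_per_fragment(reads))
```

In this sketch the five identical `("AAACCC", "frag1")` reads are treated as copies of one original molecule, which is the sense in which the tag helps normalise amplification variability.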

The tags can also be used to help identify sequences from different sources. If adapters are used with different sequences for different sources of biological materials, then the different sources can be pooled but still identified via the tag when the tags are sequenced. Thus the disclosure herein includes the use of two or more different populations of adapters for the multiplexing of the analysis of different samples. Disclosed herein therefore are kits containing two or more adapters of different sequence.
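The pooling and identification described above might be sketched, purely as an example, as a lookup of each read's tag against a table of known sample tags. The tag sequences, sample names, and the assumption that the tag occupies the first six bases of each read are all hypothetical.

```python
def demultiplex(reads, tag_to_sample, tag_length=6):
    """Assign pooled reads to their source sample by adapter tag.

    Assumes (for illustration) that each read begins with the tag;
    reads whose tag is not recognised are returned separately.
    """
    pools = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)
        else:
            pools[sample].append(insert)
    return pools, unassigned


# Hypothetical tags for two pooled samples.
tags = {"ACGTAC": "sample_A", "TTGCAA": "sample_B"}
pooled = ["ACGTACGGGATT", "TTGCAACCCTAA", "ACGTACTTTTGG", "GGGGGGAAAA"]
pools, unassigned = demultiplex(pooled, tags)
print(pools, unassigned)
```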

The oligonucleotide with the 5′-triphosphate may be blocked at the 3′ end to prevent self joining. The blocking moiety may be a phosphate group or a similar moiety. Alternatively the 3′ end may be a dideoxy nucleotide with no 3′-OH group.

The oligonucleotide with the 5′-triphosphate may be produced chemically or enzymatically. A suitable nucleotide 5′-triphosphate may be chemically coupled to a suitable oligonucleotide using suitable chemical couplings. For example, as shown in the examples, the nucleotide triphosphate may contain an azido (N₃) group and the oligonucleotide may contain an alkyne group. Alternatively a suitable oligonucleotide monophosphate may be turned into a triphosphate either chemically or enzymatically.

The sequence of the 5′-triphosphate adapter oligonucleotide depends on the specific application and suitable adapter oligonucleotides may be designed using known techniques. A suitable adapter oligonucleotide may, for example, consist of 20 to 100 nucleotides. The sequence of the adapter may be selected to be complementary to a suitable amplification/extension primer.

The second oligonucleotide, or oligonucleotide 5′-triphosphate adapter, may be single stranded or double stranded. The double stranded adapter has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions. The overhang serves to hybridise to the end of the single stranded nucleic acid to which the adapter is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded sample molecules double stranded. The adapters can be ‘forked’ adapters having regions which are non-complementary as well as regions which are complementary.

Where the adapter is hybridised to the nucleic acid sample, the attachment of the adapter can be carried out using a template dependent polymerase. Any polymerase suitable for the incorporation of a nucleotide triphosphate can be used. The adapter can be thought of as a nucleotide triphosphate attached to an oligonucleotide duplex. Thus the adapter carries its own template.

The second oligonucleotide, or oligonucleotide 5′-triphosphate adapter, may have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin. The hairpin may have a 3′-overhang suitable for polymerase extension. The term single stranded therefore includes a single strand which is in part single stranded, and in part double stranded at certain temperatures, but which can be made single stranded by increasing the temperature.

The second oligonucleotide may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling.

The second oligonucleotide may have one or more modifications which allow site specific strand cleavage. The second oligonucleotide may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment.

The second oligonucleotide may be attached to a solid surface, or may contain a modification allowing for subsequent immobilisation or capture. The joining reaction may be carried out on a solid support, or the joined products may be captured onto a surface after joining. The oligonucleotides may carry a moiety for surface capture, for example a biotin moiety. Alternatively the attachment may be covalent. The oligonucleotides may be immobilised on a solid support, and used to capture the single stranded oligonucleotide fragments.

The second oligonucleotide may be DNA, RNA or a mixture thereof. Where the adapter contains two strands, one strand may be DNA and one strand may be RNA.

Copies of the first single stranded oligonucleotides may be produced by extending the 3′-end of the attached adapter or hairpin. The extension of the adapter or hairpin produces an extended adapter or hairpin. Where the adapter is a hairpin, the extended hairpin can also be described as a double stranded nucleic acid having one end joined. Upon denaturation, the extended hairpin becomes a single stranded molecule, but the length of the double stranded portion (for example at least 100 base pairs) means that the sample rapidly rehybridises to form the extended hairpin. The extension reaction may be carried out in solution, or on a solid support.

In the case where the second adapter is a hairpin, the adapter may be for example 5-20 bases of a first complementary sequence, a single stranded loop comprising a sequence that hybridises to the solid support and the sequencing primer (e.g. 50-70 nucleotides), optionally a unique index sequence (e.g. 6-10 nucleotides) and optionally one or more locations such as uracil for site specific cleavage, a second complementary sequence complementary to the first complementary sequence and optionally a 3′ overhang (e.g. 1-10 bases). Thus the hairpin constructs may be 60 to 100 nucleotides or more in length.
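As a rough sketch only, the hairpin layout described above could be assembled in silico as shown below. The sequences are placeholders, the order of loop and index is an assumption (the text does not fix it), and the length checks simply mirror the example ranges given above rather than any requirement.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_hairpin(stem, loop, index="", overhang=""):
    """Assemble 5'-stem / loop / index / complementary stem / 3'-overhang.

    The second stem is the reverse complement of the first, so the
    two stems can fold back on each other to form the hairpin.
    """
    if not 5 <= len(stem) <= 20:
        raise ValueError("stem outside the 5-20 base example range")
    if index and not 6 <= len(index) <= 10:
        raise ValueError("index outside the 6-10 base example range")
    if len(overhang) > 10:
        raise ValueError("overhang longer than the 10 base example range")
    return stem + loop + index + reverse_complement(stem) + overhang


# Placeholder sequences; "N" marks unspecified overhang bases.
hairpin = build_hairpin("GATTACAGG", "T" * 50, index="ACGTAC", overhang="NNN")
print(len(hairpin))
```

With these placeholder parts the construct is 77 nucleotides long, which falls within the 60 to 100 nucleotide range mentioned above.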

The method may be used in order to prepare samples for nucleic acid sequencing. The method may be used to sequence a population of synthetic oligonucleotides, for example for the purposes of quality control. Alternatively, the first oligonucleotides may come from a population of nucleic acid molecules from a biological sample. The population may be fragments of between 100-10000 nucleotides in length. The fragments may be 200-1000 nucleotides in length. The fragments may be of random variable sequence. The order of bases in the sequence may be known, unknown, or partly known. The fragments may come from treating a biological sample to obtain fragments of shorter length than exist in the naturally occurring sample. The fragments may come from a random cleavage of longer strands. The fragments may be derived from treating a nucleic acid sample with a chemical reagent (for example sodium bisulfite, acid or alkali) or enzyme (for example with a restriction endonuclease or other nuclease). The fragments may come from a treatment step that causes double stranded molecules to become single stranded.

Methods of the invention may be useful in preparing a population of nucleic acid strands for sequencing, for example a population of bisulfite-treated single-stranded nucleic acid fragments. Bisulfite treatment produces single-stranded nucleic acid fragments, typically of about 250-1000 nucleotides in length. The population may be treated with bisulfite by incubation with bisulfite ions (HSO₃⁻). The use of bisulfite ions (HSO₃⁻) to convert unmethylated cytosines in nucleic acids into uracil is standard in the art and suitable reagents and conditions are well known. Numerous suitable protocols and reagents are also commercially available (for example, EpiTect™, Qiagen NL; EZ DNA Methylation™, Zymo Research Corp CA; CpGenome Turbo Bisulfite Modification Kit, Millipore; TrueMethyl™, Cambridge Epigenetix, UK). Bisulfite treatment converts cytosine and 5-formylcytosine residues in a nucleic acid strand into uracil. However, a small proportion of cytosine and 5-formylcytosine residues are eliminated by bisulfite treatment rather than converted to U, leading to the formation of abasic sites in the nucleic acid strands, which tends to cause strand cleavage.
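The effect of the conversion on sequence can be modelled, as a simplified sketch, by replacing each unprotected C with U. The function name and the use of a set of protected positions (standing in for residues such as 5-methylcytosine, which resist conversion) are assumptions for this example; the model ignores incomplete conversion, abasic site formation and strand cleavage.

```python
def bisulfite_convert(strand, protected_positions=frozenset()):
    """Return the strand after idealised bisulfite conversion.

    Every cytosine is read as uracil unless its 0-based position is
    listed in protected_positions (e.g. a methylated cytosine).
    """
    return "".join(
        "U" if base == "C" and i not in protected_positions else base
        for i, base in enumerate(strand)
    )


# Hypothetical strand with a protected cytosine at position 1.
print(bisulfite_convert("ACGTCC", protected_positions={1}))
```

In this example the protected cytosine at position 1 is retained while the two unprotected cytosines at the 3′ end are converted to uracil.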

The bisulfite solution may be provided in the form of sodium bisulfite, potassium bisulfite or ammonium bisulfite. Where the term bisulfite is used, the bisulfite may be obtained from any source, including metabisulfite. Thus the bisulfite may be provided by ammonium, sodium or potassium bisulfite or metabisulfite.

The sample may be compared with a sample which has not undergone bisulfite treatment. In such cases the sample may be prepared using alternative fragmentation methods, for example physical shearing followed by heat denaturation. Alternatively the sample may be compared with a sample which has undergone an alternative sample preparation method, for example double stranded ligation of adapters.

In other embodiments, a population of DNA strands having one or more abasic sites may be produced by subjecting a population of nucleic acid molecules to acid hydrolysis. The population may be subjected to acid hydrolysis by incubation at an acidic pH (for example, pH 5) and elevated temperature (for example, greater than 70° C.). A proportion of the purine bases in the nucleic acid strands will be lost, to generate abasic sites. The number of abasic sites formed depends on the pH, concentration of buffer, temperature and length of incubation.

In other embodiments, a population of DNA strands having one or more abasic sites may be produced by treating the population of nucleic acid strands with uracil-DNA glycosylase (UDG). The population may be treated with UDG by incubation with UDG at 37° C. UDG excises uracil residues in the nucleic acid strands leaving abasic sites. UDG may be obtained from commercial sources.

Disclosed herein is a method comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a second oligonucleotide sequence to the sample of DNA strands using template independent nucleic acid polymerase.

Disclosed herein is a method comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a second oligonucleotide sequence having a 5′-triphosphate to the sample of DNA strands.

A population of short duplex fragments can be made single stranded using suitable treatment steps, for example heat treatment.

The population of nucleic acid molecules may be a sample of DNA or RNA, for example a genomic DNA sample. Suitable DNA and RNA samples may be obtained or isolated from a sample of cells, for example, mammalian cells such as human cells or tissue samples, such as biopsies. In some embodiments, the sample may be obtained from a formalin fixed paraffin embedded (FFPE) tissue sample. Suitable cells include somatic and germ-line cells.

The population may be a diverse population of nucleic acid molecules, for example a library, such as a whole genome library or a loci specific library.

Nucleic acid strands in the population may be amplified nucleic acid molecules, for example, amplified fragments of the same genetic locus or region from different samples.

Nucleic acid strands in the population may be enriched. For example, the population may be an enriched subset of a sample produced by pull-down onto a hybridisation array or digestion with a restriction enzyme.

Methods of the invention may be useful in producing populations of mono-adapted single stranded nucleic acid fragments, i.e. nucleic acid strands having an adapter oligonucleotide attached to their 3′ termini. In some embodiments, populations of 3′ adapted single stranded nucleic acid fragments may be used directly for sequencing and/or amplification.

The sequence of the second oligonucleotide may be entirely known, or may include a variable region. The sequence may be a universal sequence such that each joined sequence has a common ‘adapter’ sequence attached to one end. The attachment of an adapter to one end of a pool of fragments of variable sequence means that copies of the variable sequences can be produced using a single ‘extension’ primer.

If a single molecule sequencing technique is employed, the first oligonucleotide sequences may be determined by hybridising the joined fragments onto a solid support carrying an array of primers complementary to the second oligonucleotide sequence. Alternatively the joined fragments may have a modification at the 3′-end which allows attachment to a solid support.

The methods disclosed may further include the step of producing one or more copies of the first single stranded oligonucleotides. The methods may include producing multiple copies of each of the different sequences. The copies may be made by hybridising a primer sequence opposite a universal sequence on the second oligonucleotide sequence, and using a nucleic acid polymerase to synthesise a complementary copy of the first single stranded sequences. The production of the complementary copy provides a double stranded polynucleotide.

The double stranded polynucleotides can be amplified using primers complementary to both strands. The amplification can be locus-specific, as shown in FIG. 3. Locus specific amplification only amplifies a selection of the fragments in the pool and is therefore a selective amplification for certain sequences. Alternatively a third oligonucleotide sequence can be attached to the joined sequences. The attachment of the third sequence, which may be a second universal sequence, can allow amplification of all the fragments in the pool as each fragment possesses two universal ends.

Alternatively the double stranded polynucleotides may be made circular by attaching the ends together. In some embodiments, double stranded molecules produced by extension of a primer annealed to the adapter sequence may be circularised by ligation. This may be useful in the generation of circular nucleic acid constructs and plasmids or in the preparation of samples for sequencing using platforms that employ circular templates (e.g. PacBio SMRT sequencing). In some embodiments, populations of circularised 3′ adapted nucleic acid fragments produced as described herein may be denatured and subjected to rolling circle or whole genome amplification using an amplification primer that hybridises to the 3′-adapter oligonucleotide to produce a population of concatemeric products. Amplification of circular fragments can be carried out using primers complementary to two regions of the single adapter sequence.

The third oligonucleotide may comprise a self-complementary double stranded region. The third oligonucleotide may be attached via ligation using a ligase. Alternatively the third oligonucleotide may be attached using a template independent polymerase as described herein. The attachment can be via blunt end ligation onto both strands of the extended duplex. Alternatively the ligation may be cohesive ligation using one or more overhanging complementary bases. Cohesive ligation may be used to help prevent adapter to adapter ligation. Cohesive ligation includes having a single base extension (a one base overhang). The one base overhang on the adapters means the ends of the adapters cannot ligate to each other.

An alternative to locus specific amplification is the use of random priming. Random priming is used in techniques such as whole genome amplification (WGA). Having a universal primer on one end of a population of single stranded fragments and a random primer on the opposite end means that amplification is more efficient than having random primers on both ends, as is the case with WGA.

Described herein are kits and components for carrying out the invention. Disclosed is a kit for use in preparing a nucleic acid sample, the kit comprising a single stranded polynucleotide having a triphosphate moiety at the 5′-end and a terminal transferase. The kit may contain a nucleotide 5′-triphosphate adapter having any of the features described herein. The adapter may be in the form of a hairpin. Disclosed herein are kits containing two or more oligonucleotide adapters of different sequence, each having a nucleotide 5′-triphosphate. The two or more different sequences may include a fixed sequence capable of hybridising to an extension primer, and a variable sequence which acts as a tag to identify the adapter (and hence the identity of the sample to which the adapters are attached).

The invention includes a single stranded oligonucleotide comprising a triphosphate moiety attached to the 5′-end via a linker. The oligonucleotide may have any of the features described herein. The oligonucleotide may be in the form of a hairpin. The linker may contain a nucleotide in which the remainder of the oligonucleotide is attached via the nucleotide base.

Certain aspects and embodiments of the invention will now be illustrated by way of example and with reference to the figures described below.

FIG. 1 shows the joining of two oligonucleotide strands using a template independent polymerase. The use of DNA is shown, but the oligonucleotide could be RNA, DNA or a hybrid thereof.

FIG. 2 shows a representation of various types of oligonucleotide triphosphates. The use of DNA is shown, but the oligonucleotide could be RNA, DNA or a hybrid thereof.

FIG. 3 shows a diagrammatic scheme for the production of a population of bi-adapted linear DNA strands using locus specific amplification.

FIG. 4 shows a diagrammatic scheme for the production of a population of bi-adapted linear duplexes by strand extension, circularisation and linearisation.

FIG. 5 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate. The oligonucleotide triphosphate is produced by reacting an azido dATP with an oligonucleotide containing DBCO. The oligonucleotide 5′-triphosphate can be joined to the 3′-OH of a further oligonucleotide using TdT. A primer can be hybridised which bridges the first and second oligonucleotides and crosses the ‘un-natural’ join. The primer has unspecified ‘N’ bases at the 3′ end to hybridise to each of the different members in the library. The primer can be extended to copy the molecules from the library and produce double stranded fragments. A further adapter can be attached to the extended ends of the double stranded fragments.

FIG. 6 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate adapter which is double stranded. The method includes the steps of joining two oligonucleotides using a template dependent polymerase or a template independent polymerase such as TdT, extension of the complement to the attached oligonucleotide using the 3′-end to form an extended duplex and ligation of a further adapter to the extended hairpin. Thus known regions are added to either end of a single stranded sample.

FIG. 7 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate having a self-complementary region (an oligonucleotide triphosphate hairpin). The method includes the steps of joining two oligonucleotides using a template dependent polymerase or a template independent polymerase such as TdT, extension of the attached oligonucleotide using the 3′-end to form an extended hairpin and ligation of a further adapter to the extended hairpin.

FIG. 8 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate having a self-complementary region (an oligonucleotide triphosphate hairpin). The method includes the steps of joining two oligonucleotides using TdT, extension of the attached oligonucleotide using the 3′-end to form an extended hairpin and ligation of a further adapter to the extended hairpin. The method shown uses a sample and an oligonucleotide triphosphate containing uracil bases, which upon digestion cleave to release a single stranded sample where both ends are known sequences.

FIG. 9 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate having a self-complementary region (an oligonucleotide triphosphate hairpin). The hairpin has a region of bases at the 3′ end marked as H (representing ‘not G’). The ‘H’ bases hybridise to a sample treated with bisulfite, where the bases are A, T, U and G (i.e. ‘not C’). The method includes the steps of joining two oligonucleotides using TdT, extension of the attached oligonucleotide using the 3′-end to form an extended hairpin and ligation of a further adapter to the extended hairpin. The method shown uses a bisulfite treated sample and an oligonucleotide triphosphate hairpin containing uracil bases, which upon digestion cleave to release a single stranded sample where both ends are known sequences. The digested single stranded sample contains the H bases internally to the known 5′ and 3′ ends.
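The reasoning behind the ‘H’ overhang can be checked with a short sketch: because a fully converted sample end contains no C, its Watson-Crick complement contains no G, so every position falls within the IUPAC code H (A, C or T). The function names are assumptions made for this illustration, and the pairing table treats U as pairing with A.

```python
# Watson-Crick partners; U (from converted C) pairs with A.
COMPLEMENT = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}
IUPAC_H = {"A", "C", "T"}  # H = "not G"


def reverse_complement(seq):
    """Reverse complement, written as DNA bases."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairs_with_h_overhang(sample_end):
    """True if the perfect complement of the sample end uses only H
    bases, i.e. the sample end itself contains no cytosine."""
    return set(reverse_complement(sample_end)) <= IUPAC_H


print(pairs_with_h_overhang("AUGGTU"))  # converted end, no C present
print(pairs_with_h_overhang("ACGT"))    # unconverted end containing C
```

In this sketch a sample end that still contains cytosine would require a G in the overhang and therefore fails the check.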

FIG. 10 shows a diagrammatic scheme for the production of a single stranded library using an oligonucleotide triphosphate having a self-complementary region (an oligonucleotide triphosphate hairpin). As with all schemes shown herein, the sample can be DNA or RNA. The sample is made into single stranded fragments averaging a few hundred base pairs in length. The adapter has a biotin group for surface attachment. The sample can be joined to the adapter in solution, or the adapter can be pre-attached to a solid support. The material attached to the solid support can undergo purification. The sample can be treated to release a 3′ hydroxyl. The 3′ hydroxyl of the hairpin can be extended using a suitable enzyme, for example reverse transcriptase (RT) where the sample is RNA or Klenow polymerase where the sample is DNA. The extension can be carried out using dUTP. The sample can have a second hairpin adapter attached to the extended end. The second hairpin can have a free 3′ hydroxyl suitable for extension. Extension of the second adapter produces a sample which extends ‘back’ through the first hairpin adapter. The extension may be carried out using dTTP instead of dUTP. Digestion with a uracil specific glycosylase results in cleavage of everything other than the second (dTTP) extension products. This produces a sample having a sequenceable region of DNA between known ends, the ends being derived from the adapter sequences.

FIG. 11 shows a fluorescent gel indicating the results of joiningoligonucleotides using TdT. The experimental data/gel image above showsthat neither the azido modified Cordycepin triphosphate (N3-ATP) (lane1), or the alkyne modified DNA sequence (Lane 5) is sufficient to shiftthe 5′Fam-DC(U) band. However, a combination of the azido modifiedCordycepin triphosphate (N3-ATP) linked to a 5′ alkyne modified DNAsequence (DBCO-Adapter) (Lane 2, 3 and 4) provides a band shift. Thisprovides direct evidence that a 5′ triphosphate labelled oligonucleotidecan be used to directly adapt the 3′ end of a single strandedoligonucleotide using a template independent polymerase. Lane 3 showsthat a 10 fold excess of both the triphosphate and the adapter causesthe FAM-DC(U) to be substantially all converted to the joinedoligonucleotide.

FIG. 12 shows a 4-20% PAGE TBE gel of OmniPin-adapted templates (templates with a hairpin triphosphate added thereto). The bandshift observed in the gel is consistent with each hairpin-triphosphate adapter successfully adding to the 3′ end of the 100 mer CEG_DC_U template.

FIG. 13 shows a 2% agarose gel showing PCR-amplified libraries prepared with triphosphate-hairpins (OmniPrep libraries). The results show that both an on-bead based variant of OmniPrep and the use of dATPαS instead of dATP are possible and add performance benefits to the ssDNA library construction method. Furthermore, nuclease treatment prior to ligation can be used to reduce potential contaminants and unwanted side-products (for example, adapter-dimer) while maintaining the integrity of the sample-prepped library.

Disclosed herein is a method comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a second oligonucleotide sequence to the sample of DNA strands using template independent nucleic acid polymerase, and
-   d) producing a copy of the first single stranded oligonucleotides.

Disclosed herein is a method comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a second oligonucleotide sequence having a 5′-triphosphate to the sample of DNA strands, and
-   d) producing a copy of the first single stranded oligonucleotides.

Disclosed herein is a method of preparing a nucleic acid sample for sequencing comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a nucleotide triphosphate to the sample of DNA strands using a nucleic acid polymerase, wherein the nucleotide triphosphate is part of an oligonucleotide adapter which can hybridise at least in part with the sample of DNA strands; and
-   d) producing a complementary copy of the first single stranded oligonucleotides using the extendable 3′-end of the oligonucleotide adapter.

The method may contain additional steps or features. Additional features or steps may include:

The second oligonucleotide, or oligonucleotide adapter, may have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin. The hairpin, when in hybridised form, may have a 3′-overhang suitable for polymerase extension. Hairpins are single oligonucleotide strands which can form intra-molecular double stranded regions. A hairpin is a nucleic acid sequence containing both a region of single stranded sequence (a loop region) and regions of self-complementary sequence such that an intra-molecular duplex can be formed under hybridising conditions (a stem region). The stem may also have a single stranded overhang. Thus the hairpin may have more than one single stranded region, the loop and the overhang. The overhang may stretch across the triphosphate ‘linker’ region at the 5′ end, thus avoiding any issues relating to the presence of the 5′-‘linker’ modification required for TdT incorporation. The self-complementary double stranded portion may be from 5-20 base pairs in length. The overhang may be from 1-10 bases in length. The overhang may contain one or more degenerate bases. The sequence may contain a mixture of bases A, C and T at each position (symbolised as H (not G)). H may be used in cases where the sample is bisulfite treated, and thus does not contain any C bases to which the G would be complementary. The overhang may consist of 1-10 H bases. The overhang may be 2-8 bases, which may be H. The overhang may have a 3′-phosphate. The overhang may have a 3′-OH.

Disclosed herein is a method of joining a first single stranded oligonucleotide and an at least partly double stranded oligonucleotide adapter, wherein the first single stranded oligonucleotide is a member of a population of fragments obtained by cleaving a biological sample and the second oligonucleotide adapter has a double stranded portion and a 3′-overhang which hybridises to the first single stranded oligonucleotide, where the joining is carried out between a 3′ hydroxyl of the first single stranded oligonucleotide and a 5′-triphosphate of the adapter. The adapter may consist of one or two strands (i.e. a hairpin or a duplex). The attachment may be catalysed by a template dependent or template independent polymerase.

Disclosed herein is a method of joining a first single stranded oligonucleotide and an oligonucleotide adapter using a template independent nucleic acid polymerase enzyme, wherein the first single stranded oligonucleotide is a member of a population of fragments obtained by cleaving a biological sample and the second oligonucleotide adapter takes the form of a hairpin having a single stranded region and a region of self-complementary double stranded sequence. The region of self-complementary double stranded sequence is capable of forming a duplex under hybridising conditions. Hybridising conditions may be for example 50° C. in a standard biological buffer as indicated in the experimental section below.

The second oligonucleotide may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling.

The second oligonucleotide, or the complement thereof where the second oligonucleotide is double stranded, is generally a chemically synthesised material having known length and modifications. The second oligonucleotide may have one or more modifications which allow site specific strand cleavage. The second oligonucleotide may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment. The second oligonucleotide may be a hairpin and the method may include a further step of cleaving the hairpin. Cleavage of the hairpin means the two strands are no longer joined. If more than one cleavage site is present, one of the strands may be fragmented such that only one of the two strands remains intact and contains the desired properties of having two ends of known sequence. Where the adapter is double stranded, the method may include a step of denaturing the extended material.

The second oligonucleotide or the strand hybridised thereto may have one or more bases which vary in sequence at the same location (i.e. the second oligonucleotide is a member of a population of second oligonucleotides). Such bases may be represented using the universal nucleotide codings known in the art. A position at which all four bases occur may be represented as N (A, G, C and T). For use with bisulfite treated samples, which are depleted in C bases, a position at which three bases occur may be represented by H (A, T and C, i.e. ‘not G’). The 3′ end of the second oligonucleotide, when in the form of a hairpin, may contain a region of bases shown as ‘H’.
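The degenerate codings described above can be illustrated with a short sketch. It is not part of the disclosed method; the function names `expand` and `population_size` are hypothetical, and only the codes N and H mentioned in the text are included.

```python
from itertools import product

# Subset of the universal (IUPAC) nucleotide codes mentioned in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any of the four bases
    "H": "ACT",   # A, C or T, i.e. 'not G'
}


def expand(pattern):
    """Return every concrete sequence represented by a degenerate pattern."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in pattern))]


def population_size(pattern):
    """Count the represented sequences without enumerating them."""
    size = 1
    for base in pattern:
        size *= len(IUPAC[base])
    return size


if __name__ == "__main__":
    print(expand("AH"))               # three members: AA, AC, AT
    print(population_size("HHHHHH"))  # size of a six-base H overhang population
```

Under these assumptions a six-base H overhang represents 3^6 = 729 distinct sequences, none of which contains G.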

The second oligonucleotide or the strand hybridised thereto may have one or more modifications to allow attachment to a solid support. For example the second oligonucleotide may contain biotin. Cleavage of the hairpin may allow part of the material to be eluted from the solid support in single stranded form, whilst the remaining part stays attached to the solid support.

A solid support is an insoluble, non-gelatinous body which presents a surface on which the polynucleotides can be immobilised. Examples of suitable supports include glass slides, microwells, membranes, or microbeads. The support may be in particulate or solid form, including for example a plate, a test tube, bead, a ball, filter, fabric, polymer or a membrane. Polynucleotides may, for example, be fixed to an inert polymer, a 96-well plate, other device, apparatus or material which is used in a nucleic acid sequencing or other investigative context. The immobilisation of polynucleotides to the surface of solid supports is well-known in the art. In some embodiments, the solid support itself may be immobilised. For example, microbeads may be immobilised on a second solid surface.

The copies of the first single stranded oligonucleotides may be produced by extending the 3′-end of the attached hairpin or the 3′ end of the duplex where the adapter is double stranded. The extension of the hairpin produces an extended hairpin. The extended hairpin can also be described as a double stranded nucleic acid having one end joined. Upon denaturation, the extended hairpin becomes a single stranded molecule, but the length of the double stranded portion (for example at least 100 base pairs) means that the sample rapidly hybridises to form the extended hairpin.

In cases where the adapter or hairpin contains a blocking moiety at the 3′ end, the blocking moiety can be removed. For example the 3′ end may be a phosphate. Methods of the invention may include a step of removing the phosphate moiety, for example treatment with a suitable kinase such as polynucleotide kinase (PNK).

Attachment of the 5′-triphosphate oligonucleotide may give rise to a join which is not a natural phosphodiester linkage. Such joins may not be substrates for nucleic acid polymerases. In such cases, the use of 3′-overhangs, either as hairpins or double stranded adapters, is advantageous as the linking region can be ‘bridged’ using an oligonucleotide primer sequence which is internal or part of the adapter. Hybridisation of a primer suitable for extension would also require such an internal spacer, and this lowers the affinity and specificity of the primer hybridisation, whereas no such issues arise where the adapter has an ‘internal’ primer which is already hybridised (or in the case of hairpins integral). The attachment of a single ‘hairpin’ which can be used as both the known end and the extendable primer when preparing a library (as shown in FIGS. 6-9) is therefore advantageous over the attachment of a single known end followed by the hybridisation of a second primer (as shown in FIG. 5). The pre-formed, or intra-molecular hybridisation spans the unnatural join, and allows efficient extension.

The extension can be carried out using a suitable enzyme and dNTPs. Where the sample is RNA, a reverse transcriptase can be used to produce the DNA/RNA duplex via the complementary DNA. Where the sample is DNA a nucleic acid polymerase can be used. The enzyme can be thermophilic or mesophilic. Suitable polymerases may include Klenow, Taq, Vent polymerase etc. If cleavage of the extended strand is desired, the extension can be carried out using dUTP as a replacement for dTTP. A mixture of dUTP, dATP, dCTP and dGTP allows for complete extension as all four bases are present but allows selective strand cleavage at the uracil nucleotides. If it is desired to leave the strand intact, then dTTP can be used along with dATP, dCTP and dGTP. The nucleotide extension mix can include one or more modified dNTPs. For example the nucleotides may be used such that the resultant extended chains are not susceptible to exonuclease degradation. One or more of the dNTPs can be alpha-PS dNTPs, such that upon incorporation an exonuclease resistant thiophosphate (PS) linkage is formed.

The extended hairpin has a double stranded end, to which a further (third) oligonucleotide can be attached. Either or both strands can be adapted by the attachment of the further oligonucleotide. More commonly a double stranded adapter would be used, thereby adapting and extending both strands. The resultant product could be described as an even further extended hairpin. Methods of using hairpins are shown in FIGS. 7-9.

After addition of further adapters, the sample may be treated to remove any adapter-adapter dimers. The treatment may involve exposure to one or more nucleases. Where the extension sample contains PS linkages, the sample is protected from digestion, whilst the adapters containing no PS linkages are digested and removed.

If the original sample resulted from a bisulfite treatment step (hence containing uracil bases), it is possible to treat the sample to fragment the strand at the uracil locations. Inclusion of one or more uracil bases in the oligonucleotide triphosphate means the adapter can also be cleaved. Exposure to an enzyme mix such as UDG/EndoVIII (USER) results in the formation of fragments having no uracil bases. Thus the hairpins can be treated to be made single stranded.
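The selective cleavage described above can be pictured with a minimal, idealised sketch. It assumes that digestion removes every uracil and breaks the strand at that position, and it ignores enzyme kinetics and end chemistry; the function names and example sequences are hypothetical.

```python
def cleave_at_uracil(strand):
    """Idealised uracil-specific digestion: remove each U and split the strand there."""
    return [fragment for fragment in strand.split("U") if fragment]


def survives_digestion(strand):
    """A strand containing no uracil is left intact by the digestion."""
    return "U" not in strand


if __name__ == "__main__":
    converted_parent = "ACGUACUUGA"  # hypothetical uracil-containing strand
    dttp_copy = "TCAAGTACGT"         # hypothetical copy made with dTTP, so no U
    print(cleave_at_uracil(converted_parent))  # uracil-free fragments
    print(survives_digestion(dttp_copy))       # the dTTP copy is untouched
```

In this picture the uracil-containing parent strand and any uracil-containing adapter are reduced to short uracil-free fragments, while a copy synthesised with dTTP remains full length.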

Disclosed herein is a method of preparing a nucleic acid sample for sequencing comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a nucleotide triphosphate to the sample of DNA strands using a nucleic acid polymerase, wherein the nucleotide triphosphate is part of an oligonucleotide adapter which can hybridise at least in part with the sample of DNA strands,
-   d) producing a complementary copy of the first single stranded oligonucleotides using the extendable 3′-end of the oligonucleotide adapter,
-   e) attaching a third oligonucleotide to the sample of DNA strands, and
-   f) denaturing the products of step e; thereby producing a mixture of nucleic acid molecules where each molecule in the mixture is a copy of a molecule from the population of nucleic acid molecules and has a known region at each end.

Disclosed herein is a method of preparing a nucleic acid sample for sequencing comprising:

-   a) providing a sample containing a population of nucleic acid molecules,
-   b) treating the population to produce a sample of DNA strands containing a mixture of first single stranded oligonucleotides of different sequence,
-   c) joining a second oligonucleotide sequence to the sample of DNA strands using template independent nucleic acid polymerase, wherein the second oligonucleotide sequence is a hairpin having an extendable 3′-end,
-   d) producing a complementary copy of the first single stranded oligonucleotides using the extendable 3′-end of the hairpin,
-   e) attaching a third oligonucleotide to the sample of DNA strands, and
-   f) cleaving the first single stranded oligonucleotides whilst leaving the copies thereof intact; thereby producing a mixture of nucleic acid molecules where each molecule in the mixture is a copy of a molecule from the population of nucleic acid molecules and has a known region at each end.

Such methods are exemplified in FIGS. 6 to 10.

The joined fragments can be used in any subsequent method of sequence determination. For example, the fragments can undergo parallel sequencing on a solid support. In such cases the attachment of universal adapters to each end may be beneficial in the amplification of the population of fragments. Suitable sequencing methods are well known in the art, and include Illumina sequencing, pyrosequencing (for example 454 sequencing) or Ion Torrent sequencing from Life Technologies™.

Populations of nucleic acid molecules with a 3′ adapter oligonucleotide and optionally a 5′ second adapter oligonucleotide may be sequenced directly. For example, the sequences of the first and second adapter oligonucleotides may be specific for a sequencing platform. For example, they may be complementary to the flowcell or device on which sequencing is to be performed. This may allow the sequencing of the population of nucleic acid fragments without the need for further amplification and/or adaptation.

The first and second adapter sequences are different. Preferably, the adapter sequences are not found within the human genome.

The nucleic acid strands in the population may have the same first adapter sequence at their 3′ ends and the same second adapter sequence at their 5′ ends, i.e. all of the fragments in the population may be flanked by the same pair of adapter sequences.

Adapting a population of single stranded nucleic acid fragments for sequencing as described herein avoids the need to produce copies or complementary strands. This is advantageous as it avoids bias introduced by amplification and other processes.

Suitable adapter oligonucleotides for the production of nucleic acid strands for sequencing may include a region that is complementary to the universal primers on the solid support (e.g. a flowcell or bead) and a region that is complementary to universal sequencing primers (i.e. which when annealed to the adapter oligonucleotide and extended allows the sequence of the nucleic acid molecule to be read). Suitable nucleotide sequences for these interactions are well known in the art and depend on the sequencing platform to be employed. Suitable sequencing platforms include Illumina TruSeq, LifeTech IonTorrent, Roche 454 and PacBio RS.

For example, the sequences of the first and second adapter oligonucleotides may comprise a sequence that hybridises to complementary primers immobilised on the solid support (e.g. 20-30 nucleotides); a sequence that hybridises to a sequencing primer (e.g. 30-40 nucleotides) and a unique index sequence (e.g. 6-10 nucleotides). Suitable first and second adapter oligonucleotides may be 56-80 nucleotides in length.
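The overall adapter length follows from summing the three regions. The following sketch simply reproduces that arithmetic; the dictionary keys and the function name are illustrative labels, not terms from the specification.

```python
# (minimum, maximum) length in nucleotides for each adapter region,
# taken from the example ranges given in the text.
REGIONS = {
    "support_primer_site": (20, 30),
    "sequencing_primer_site": (30, 40),
    "index": (6, 10),
}


def adapter_length_range(regions=REGIONS):
    """Sum the per-region minima and maxima to get the total adapter length range."""
    shortest = sum(lo for lo, _ in regions.values())
    longest = sum(hi for _, hi in regions.values())
    return shortest, longest


if __name__ == "__main__":
    print(adapter_length_range())  # (56, 80)
```

The result, 56 to 80 nucleotides, matches the range stated for suitable adapter oligonucleotides.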

Following adaptation and/or labelling as described herein, the nucleic acid molecules may be purified by any convenient technique. Following preparation, the population of nucleic acid molecules may be provided in a suitable form for further treatment as described herein. For example, the population of nucleic acid molecules may be in aqueous solution in the absence of buffers before treatment as described herein.

In other embodiments, populations of nucleic acid molecules with a 3′ adapter oligonucleotide and optionally a 5′ adapter oligonucleotide, may be further adapted and/or amplified as required, for example for a specific application or sequencing platform.

Preferably, the nucleic acid strands in the population may have the same first adapter sequence at their 3′ ends and the same second adapter sequence at their 5′ ends, i.e. all of the fragments in the population may be flanked by the same pair of adapters, as described above. This allows the same pair of amplification primers to amplify all of the strands in the population and avoids the need for multiplex amplification reactions using complex sets of primer pairs, which are susceptible to mis-priming and the amplification of artefacts.

Suitable first and second amplification primers may be 20-25 nucleotides in length and may be designed and synthesised using standard techniques. For example, a first amplification primer may hybridise to the first adapter sequence, i.e. the first amplification primer may comprise a nucleotide sequence complementary to the first adapter oligonucleotide; and a second amplification primer may hybridise to the complement of the second adapter sequence, i.e. the second amplification primer may comprise the nucleotide sequence of the second adapter oligonucleotide. Alternatively, a first amplification primer may hybridise to the complement of the first adapter sequence, i.e. the first amplification primer may comprise the nucleotide sequence of the first adapter oligonucleotide; and a second amplification primer may hybridise to the second adapter sequence, i.e. the second amplification primer may comprise a nucleotide sequence complementary to the second adapter oligonucleotide.
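The relationship between an adapter sequence and a primer that hybridises to it, or to its complement, can be sketched as follows. This is an illustration only: perfect Watson-Crick pairing is assumed, and the function names and the example strand are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_hybridising_to(target):
    """A primer that hybridises to `target` carries the reverse complement of `target`."""
    return reverse_complement(target)


def primer_hybridising_to_complement_of(target):
    """A primer that hybridises to the complement of `target` carries `target` itself."""
    return target


def anneals(primer, strand):
    """True if `primer` is perfectly complementary to some site on `strand`."""
    return reverse_complement(primer) in strand


if __name__ == "__main__":
    adapter = "GATCGGAAGAGC"             # illustrative adapter segment
    strand = "AAAA" + adapter + "TTTT"   # hypothetical adapted strand
    print(anneals(primer_hybridising_to(adapter), strand))
    print(anneals(primer_hybridising_to_complement_of(adapter),
                  reverse_complement(strand)))
```

Both checks return `True` under these assumptions: the first primer binds the adapted strand itself, and the second binds the copy of that strand.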

In some embodiments, the first and second amplification primers may incorporate additional sequences.

Additional sequences may include index sequences to allow identification of the amplification products during multiplex sequencing, or further adapter sequences to allow sequencing of the strands using a specific sequencing platform.

EXPERIMENTS

All reagents and buffers are commercially available unless otherwise stated.

Example 1. Preparation of Oligonucleotide Triphosphates

Shown below is the ligation of the azido modified Cordycepin triphosphate (N3-ATP) to a 5′ alkyne modified DNA sequence (DBCO-Adapter) using copper-free click chemistry. The click reaction forms the 5′ triphosphate modified oligonucleotide (ATP-Triazole-Adapter). The 5′ triphosphate modified oligonucleotide (ATP-Triazole-Adapter) is then ligated to the 3′ end of a second oligonucleotide (5′Fam-DC(U)) in a non-templated fashion using TdT.

Reaction Conditions.

To 2 μL of DBCO-Adapter DNA (500 mM in DMSO) was added 2 μL of N3-ATP (500 mM in water) and the mixture was incubated at 37° C. for 1 hr. To the ligated ATP-Triazole-Adapter were added 2 μL of TdT buffer (10×), 2 μL of CoCl2 (10×) and 5 μL of 5′-FAM-DC(U) DNA; the reaction was made up to 20 μL with water and incubated at 37° C. for 30 mins. The reaction mixture was loaded directly onto a 4% agarose gel using 4 μL of a 6× loading buffer, and run for 3 hr at 90 V. The gel was imaged on a Typhoon imager using the standard setting for detecting the FAM fluorophore (FIG. 11).

The experimental data/gel image (FIG. 11) shows that neither the azido modified Cordycepin triphosphate (N3-ATP) (Lane 1) nor the alkyne modified DNA sequence (Lane 5) is sufficient to shift the 5′Fam-DC(U) band. However, a combination of the azido modified Cordycepin triphosphate (N3-ATP) linked to a 5′ alkyne modified DNA sequence (DBCO-Adapter) (Lanes 2, 3 and 4) provides a band shift. This provides direct evidence that a 5′ triphosphate labelled oligonucleotide can be used to directly adapt the 3′ end of a single stranded oligonucleotide using a template independent polymerase. Lane 3 shows that a 10 fold excess of both the triphosphate and the adapter causes the FAM-DC(U) to be substantially all converted to the joined oligonucleotide.

Example 2. Addition of a Hairpin-Triphosphate Adapter to ssDNA

Materials:

Oligonucleotides used in the experiment are listed in Table 1.

TABLE 1 Oligonucleotide sequences

| Oligonucleotide | Sequence 5′-3′ |
|---|---|
| CEG_DC_U | pCTCACCCACAACCACAAACATAUGATUAUGGUGAATUUGATUGAATUAGTTUUGUGUTTTAUGAAGTGUGAUAGUUTTAGTGATGTGATGGGTGGTATNN |
| CEG_OP_6H_IDX_1 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_4 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_5 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_6 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_2 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_3 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_12 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_19 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTTTCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |

DBCO = dibenzocyclooctyne; p = phosphate

Modified nucleotide triphosphates used in the experiment are listed in Table 2.

TABLE 2 Nucleotide triphosphates

| Nucleotide triphosphate | Structure |
|---|---|
| N⁶-(6-Azido)hexyl-2′-dATP (2′dATP-N3, Jena Biosciences P/N NU-1707S) | |

Enzymes used in the experiment are listed below.

Terminal deoxytransferase (TdT, Enzymatics P/N P7070L).

Method:

Step 1. Formation of Hairpin-Triphosphate Adapters

To 0.5 μL of Tris-HCl (100 mM, pH 7.0) was added 1 nmol of 2′dATP-N3 (2 μL of 500 μM in 10 mM Tris-HCl, pH 7.0) and 1.25 nmol of the CEG_OP_6H_N adapter (2.5 μL of 500 μM in DMSO) as shown in Table 3. The mixture was incubated at 10° C. for 2 hr and diluted down to a final concentration of 100 μM by the addition of 5 μL of Tris-HCl (100 mM, pH 7.0).
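The amounts and the final concentration quoted above can be checked with a short calculation. The sketch assumes that the volumes are additive and that the final concentration refers to the limiting reagent (the azido nucleotide triphosphate); the function names are hypothetical.

```python
def amount_nmol(volume_ul, conc_um):
    """Amount in nmol contained in `volume_ul` microlitres of a `conc_um` micromolar stock."""
    return volume_ul * conc_um / 1000.0


def concentration_um(amount_in_nmol, volume_ul):
    """Concentration in micromolar of `amount_in_nmol` dissolved in `volume_ul` microlitres."""
    return amount_in_nmol * 1000.0 / volume_ul


if __name__ == "__main__":
    triphosphate = amount_nmol(2.0, 500.0)  # azido nucleotide triphosphate
    adapter = amount_nmol(2.5, 500.0)       # DBCO hairpin adapter
    total_volume = 0.5 + 2.0 + 2.5 + 5.0    # buffer + reagents + diluent, in microlitres
    limiting = min(triphosphate, adapter)
    print(triphosphate, adapter, total_volume)
    print(concentration_um(limiting, total_volume))
```

Under these assumptions the triphosphate (1 nmol) is limiting, the adapter (1.25 nmol) is in 1.25-fold excess, and the 10 μL final volume gives 100 μM of hairpin-triphosphate if the click reaction goes to completion.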

TABLE 3 Hairpin-triphosphate mixes

| Ref | Volume (μL) of Tris-HCl (100 mM, pH 7.0) | Volume (μL) of N⁶-(6-Azido)hexyl-2′-dATP (500 μM) | Volume (μL) of CEG_OP_6H_N (500 μM) |
|---|---|---|---|
| CEG19_105_1 (N = IDX 1) | 0.5 | 2.0 | 2.5 |
| CEG19_105_2 (N = IDX 4) | 0.5 | 2.0 | 2.5 |
| CEG19_105_3 (N = IDX 5) | 0.5 | 2.0 | 2.5 |
| CEG19_105_4 (N = IDX 6) | 0.5 | 2.0 | 2.5 |
| CEG19_105_5 (N = IDX 2) | 0.5 | 2.0 | 2.5 |
| CEG19_105_6 (N = IDX 3) | 0.5 | 2.0 | 2.5 |
| CEG19_105_7 (N = IDX 12) | 0.5 | 2.0 | 2.5 |
| CEG19_105_8 (N = IDX 19) | 0.5 | 2.0 | 2.5 |

Exemplary Hairpin-triphosphate structure (e.g. CEG19_105_5)

Step 2. Addition of ssDNA Template with OmniPin Adapter

To 3 pmol of ssDNA template (100 ng, CEG_DC_U) in 7 μL water was added 1 μL of CEG TdT 10× Buffer (1 M Tris-acetate, 12.5 mM cobalt acetate, 1.25 mg/mL BSA, pH 6.6), 300 pmol of the OmniPin adapter (1 μL of CEG19_105_1-8, Table 4) followed by 20 U of TdT (20 U/μL). The reaction mixture was incubated at 37° C. for 30 mins before purification of the DNA.
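The conversion from 100 ng of template to roughly 3 pmol can be reproduced with a short calculation. The sketch assumes an average mass of about 330 g/mol per nucleotide of single stranded DNA, which is a common approximation and is not stated in the text, and takes the template length as 100 nucleotides; the function names are hypothetical.

```python
AVG_NT_MASS = 330.0  # g/mol per nucleotide of ssDNA; approximate value assumed here


def ssdna_pmol(mass_ng, length_nt, nt_mass=AVG_NT_MASS):
    """Convert a mass of ssDNA (ng) into picomoles of molecules of the given length."""
    # ng divided by g/mol gives nmol; multiply by 1000 for pmol.
    return mass_ng * 1000.0 / (length_nt * nt_mass)


def fold_excess(adapter_pmol, template_pmol):
    """Molar ratio of adapter to template."""
    return adapter_pmol / template_pmol


if __name__ == "__main__":
    template = ssdna_pmol(100.0, 100)
    print(round(template, 2))                      # about 3 pmol
    print(round(fold_excess(300.0, template)))     # roughly 100-fold
```

With these assumptions 100 ng of a 100 mer corresponds to about 3 pmol, so the stated 300 pmol of adapter would be an approximately 100-fold molar excess over template.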

TABLE 4 Adaption mixes

| Ref | Volume (μL) of CEG_DC_U (100 ng/μL) | Volume (μL) dH₂O | Volume (μL) CEG TdT Buffer (10x) | Volume (μL) CEG19_105_1-8 (100 μM) | Volume (μL) TdT (10 U/μL) |
|---|---|---|---|---|---|
| CEG19_105_9 | 1 | 6 | 1 | 1 (CEG19_105_1) | 1 |
| CEG19_105_10 | 1 | 6 | 1 | 1 (CEG19_105_2) | 1 |
| CEG19_105_11 | 1 | 6 | 1 | 1 (CEG19_105_3) | 1 |
| CEG19_105_12 | 1 | 6 | 1 | 1 (CEG19_105_4) | 1 |
| CEG19_105_13 | 1 | 6 | 1 | 1 (CEG19_105_5) | 1 |
| CEG19_105_14 | 1 | 6 | 1 | 1 (CEG19_105_6) | 1 |
| CEG19_105_15 | 1 | 6 | 1 | 1 (CEG19_105_7) | 1 |
| CEG19_105_16 | 1 | 6 | 1 | 1 (CEG19_105_8) | 1 |

Results:

Purified hairpin-adapted template products (9 μL each of CEG19_105_9-16) were loaded onto a 4-20% PAGE TBE gel (Life Technologies, P/N EC62255BOX) and run for 35 minutes at 200 V (FIG. 12). The bandshift observed in the gel is consistent with each hairpin-triphosphate adapter successfully adding to the 3′ end of the 100 mer CEG_DC_U template.

Example 3: Whole Human Genome Sequencing Using Libraries Prepared with OmniPrep Single Stranded Library Construction Method

Materials:

Oligonucleotides used in the experiment are listed in Table 5.

TABLE 5 Oligonucleotide sequences

| Oligonucleotide | Sequence 5′-3′ |
|---|---|
| CEG_OP_6H_IDX_1 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_4 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_5 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_6 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_2 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_3 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_12 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_OP_6H_IDX_19 | DBCO-GATCGGAAGAGCUCAAGCAGAAGACGGCATACGAGATTTTCACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTHHHHHHp |
| CEG_Frw_AD_U | AATGATACGGCGACCACCGAGATCTACACTCTUTCCCTACACGACGCTCTUCCGATCT |
| CEG_Frw_AD_Comp | pGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTp |
| Fwd_PCR_Primer | AATGATACGGCGACCACCGAG |
| Rev_PCR_Primer | CAAGCAGAAGACGGCATACGA |

DBCO = dibenzocyclooctyne; p = phosphate

Modified nucleotide triphosphates used in the experiment are listed in Table 6.

TABLE 6 Nucleotide triphosphates

| Nucleotide triphosphate | Structure |
|---|---|
| N⁶-(6-Azido)hexyl-2′-dATP (2′dATP-N3, Jena Biosciences P/N NU-1707S) | |

Enzymes used in the experiment are listed in Table 7 below.

TABLE 7 Enzymes

| Enzyme | Vendor and P/N |
|---|---|
| Terminal deoxytransferase (TdT) | Enzymatics P7070L |
| T4 Polynucleotide Kinase (PNK) | Enzymatics Y9040L |
| Klenow(exo-) DNA polymerase | Enzymatics P7010-HC-L |
| T4 DNA Ligase | Enzymatics L6030-HC-L |
| Thermolabile UDG | Enzymatics G5020L |
| VeraSeq Ultra DNA polymerase | Enzymatics P7520L |

Methods

Step 1. Formation of Hairpin-Triphosphate Adapters

To 0.5 μL of Tris-HCl (100 mM, pH 7.0) was added 1 nmol of 2′dATP-N3 (2 μL of 500 μM in 10 mM Tris-HCl, pH 7.0) and 1.25 nmol of the CEG_OP_6H_* adapter (2.5 μL of 500 μM in DMSO, * denotes separate indexed hairpin adapters each listed in Table 5). Each mixture was incubated at 10° C. for 2 hr and diluted down to a final concentration of 100 μM by the addition of 5 μL of Tris-HCl (100 mM, pH 7.0).

Step 2. Bisulfite Conversion of Human Genomic DNA

Human cerebellum genomic DNA (AMSbio, 1 μg) was bisulfite (BS) or oxidative bisulfite (oxBS) converted using the TrueMethyl conversion kit (CEGX) following the manufacturer's specification. The DNA was then quantified by Qubit ssDNA assay kit (Invitrogen).

Step 3. PNK Treatment of Genomic DNA

To 100 ng of either native, BS or oxBS treated human cerebellum gDNA in 1× TdT buffer (100 mM Tris-acetate, 1.25 mM CoAc₂, 125 μg/mL BSA, pH 6.6 @ 25° C.) was added 10 Units of PNK, and the mixture was incubated at 37° C. for 20 min. The PNK reaction was stopped by heat denaturation at 95° C. for 3 min.

Step 4. Addition of the Hairpin Adapter to the PNK Treated Genomic DNA

To the PNK treated DNA (after heat denaturation) were added 50 pmol of the hairpin adapter (hairpin-triphosphate adapter from Step 1) and 20 Units of TdT, and the mixture was incubated at 37° C. for 30 min.

Step 4. Magnetic Bead Purification of Hairpin-Adapted Genomic DNA

The hairpin-adapted DNA fragments were purified using magnetic beads (30% PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris pH 8, 0.1% w/v carboxy Sera-Mag magnetic particles (GE P/N 09-981-123)). Samples were washed using freshly prepared acetonitrile:water (70:30) and eluted from the beads in ultra pure water.

Step 5. Klenow Extension and Ligation of the Second Adapter

The purified hairpin-adapted DNA fragments in 1× ligation buffer (50 mM Tris-HCl, 10 mM MgCl₂, 5 mM DTT, 1 mM ATP, pH 7.6 @ 25° C.) supplemented with 1 mM dNTP, 10 U of PNK and 50 U of Klenow(exo-) were incubated at 37° C. for 30 min before the PNK and Klenow(exo-) were heat denatured at 95° C. for 3 min. The reaction mixture was used directly in the ligation reaction by the addition of PEG 6000 to a final concentration of 7.5%, 0.1 pmol of pre-annealed DNA adapters and 600 U of T4 DNA Ligase. The mixture was incubated for 15 mins at 25° C. to yield the doubly-adapted DNA fragment product.

Step 6. Magnetic Bead Purification of Doubly-Adapted DNA Fragments

The doubly-adapted DNA fragments were purified twice with an 18% PEG solution (18% PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris pH 8, 0.1% w/v Carboxy-coated magnetic particles). Samples were washed using freshly prepared acetonitrile:water (70:30) and eluted from the beads in ultra pure water.

Step 7. UDG Digestion to Yield Final Libraries

The purified doubly-adapted DNA fragments in 1× VeraSeq buffer were treated with 1 U of thermolabile UDG for 20 mins at 37° C. before the UDG was heat denatured at 95° C. for 5 min. This final library of single stranded, doubly adapted fragments is referred to as an OmniPrep library. Samples were either sequenced directly as PCR-free libraries or PCR-amplified for 10 cycles before sequencing.

Step 8. PCR Amplification of OmniPrep Libraries

PCR amplification of the OmniPrep libraries was performed on an Agilent Surecycler 8800 thermocycler in 1× VeraSeq buffer supplemented with 125 μM of the forward PCR primer (Fwd_PCR_Primer), 125 μM of the reverse PCR primer (Rev_PCR_Primer), 500 μM dNTPs and 1 U of VeraSeq 2.0 DNA polymerase. Thermocycling conditions were 10 cycles of:

-   Denaturation at 95° C. for 30 sec
-   Annealing at 60° C. for 30 sec
-   Extension at 72° C. for 90 sec

Step 9. Magnetic Bead Purification of Amplified OmniPrep Libraries

The amplified OmniPrep libraries were purified once with an 18% PEG solution (18% PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris pH 8, 0.1% w/v Carboxy-coated magnetic particles). Samples were washed using freshly prepared acetonitrile:water (70:30) and eluted from the beads in ultra pure water.

Step 10. Sequencing and Analysis of the OmniPrep Libraries

Sequencing was carried out on an Illumina NextSeq500 sequencer with a paired end run (2×75 bp). Two individual runs were conducted, one for the PCR-free libraries and a second for the PCR-amplified libraries. Libraries were prepared in duplicate (native) or triplicate (converted) and pooled to a final concentration of 2 nM, then denatured and diluted according to the manufacturer's instructions before sequencing. The raw output fastq read sequences were quality filtered and trimmed using TrimGalore, and the trimmed data were aligned to the human genome (release 37.55) with Bismark software. A summary of the sequencing results is shown in Table 8.

TABLE 8 Whole human genome OmniPrep library sequencing metrics

Sample                 # PE reads   # uniquely mapped reads   # non-uniquely mapped reads   Alignment rate
PCR_native_rep1        16640651     11507328                  457049                        69.2%
PCR_native_rep2        14304479     10632055                  441680                        74.3%
PCR_BS_rep1            28416019     17265923                  984970                        60.8%
PCR_BS_rep2            7866065      5547305                   304164                        70.5%
PCR_BS_rep3            32336222     22677644                  1146295                       70.2%
PCR_oxBS_rep1          35661755     24297095                  1246204                       68.2%
PCR_oxBS_rep2          24124304     17101247                  990627                        70.9%
PCR_oxBS_rep3          31384442     22721639                  1186467                       72.4%
PCRfree_native_rep1    41729358     32452776                  1516459                       77.8%
PCRfree_native_rep2    46284959     34231016                  1673857                       74.0%
PCRfree_BS_rep1        3996331      2282322                   148326                        57.1%
PCRfree_BS_rep2        7140997      4731152                   300306                        66.3%
PCRfree_BS_rep3        6164831      4115330                   258993                        66.8%
PCRfree_oxBS_rep1      20869094     13311899                  863129                        63.8%
PCRfree_oxBS_rep2      3776160      2469895                   166376                        65.5%
PCRfree_oxBS_rep3      11869124     8218475                   508641                        69.3%
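The alignment rates in Table 8 appear to equal the number of uniquely mapped reads divided by the number of paired-end reads. That relationship is inferred from the tabulated values and is not stated in the text, so the following Python sketch is only a consistency check under that assumption. The function name is illustrative and only three rows of the table are reproduced.

```python
# Three rows copied from Table 8:
# (sample, # PE reads, # uniquely mapped reads, reported alignment rate in %).
ROWS = [
    ("PCR_native_rep1", 16640651, 11507328, 69.2),
    ("PCR_BS_rep1", 28416019, 17265923, 60.8),
    ("PCRfree_BS_rep1", 3996331, 2282322, 57.1),
]


def alignment_rate(pe_reads, uniquely_mapped):
    """Percentage of read pairs that mapped uniquely (assumed definition)."""
    return 100.0 * uniquely_mapped / pe_reads


for sample, pe_reads, unique, reported in ROWS:
    computed = alignment_rate(pe_reads, unique)
    print(f"{sample}: computed {computed:.2f}%, reported {reported}%")
```

For these three rows the computed values agree with the reported rates to within rounding.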

Results and Observations

The sequencing data demonstrate the successful sequencing of PCR-amplified and PCR-free OmniPrep libraries prepared using ssDNA from native and converted (bisulfite and oxidative-bisulfite) human cerebellum gDNA. High alignment rates indicate that the majority of the data is comprised of unique reads that align unambiguously to the human genome. This experiment illustrates that the OmniPrep method can be used to prepare sequenceable libraries from ssDNA that accurately map to the expected genome of interest.

Example 4. On-Bead OmniPrep Library Preparation

Materials

Oligonucleotides used in the experiment are listed in Table 9.

TABLE 9 Oligonucleotide sequences

Oligonucleotide           Sequence 5′-3′
CEG_OPS_6H_Biotin         DBCO-GAT(Biotin)CGGAAGAGCUTACACTCTTTCCCTACACGACGCTCTTCCGATCTHHHHHHp
CEG_SHORT_IDX_AD_3P       GTGACTGGAGTUCAGACGTGTGCTCTUCCGATCTp
CEG_SHORT_IDX_COMP_53P    pGATCGGAAGAGCACACGTCTGAACTCCAGTCACp
Fwd_PCR_Primer_long       AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
Rev_PCR_Primer_long       CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGT

DBCO = dibenzocyclooctyne
p = phosphate
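The CEG_OPS_6H_Biotin sequence in Table 9 is consistent with a hairpin: its 5′ end is the reverse complement of a stretch near its 3′ end. The following Python sketch locates that stem. It makes several simplifying assumptions that are not part of the original text: U is treated as T for base pairing, the DBCO, biotin and phosphate modifications are omitted, and the helper names are hypothetical. H is the IUPAC code for A, C or T.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Base sequence of CEG_OPS_6H_Biotin from Table 9, modifications omitted.
CEG_OPS_6H = "GATCGGAAGAGCUTACACTCTTTCCCTACACGACGCTCTTCCGATCTHHHHHH"


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_stem(seq):
    """Return (stem_length, three_prime_tail).

    stem_length is the longest 5' prefix whose reverse complement occurs
    further downstream; three_prime_tail is whatever follows that
    downstream match.
    """
    dna = seq.replace("U", "T")  # assumption: U pairs like T
    stem, tail = 0, dna
    for k in range(1, len(dna) // 2 + 1):
        prefix = dna[:k]
        if any(base not in COMPLEMENT for base in prefix):
            break  # stop at degenerate positions such as H
        pos = dna.rfind(revcomp(prefix), k)
        if pos == -1:
            break
        stem, tail = k, dna[pos + k:]
    return stem, tail


print(find_stem(CEG_OPS_6H))
```

Under these assumptions the call returns `(12, 'THHHHHH')`: a 12-base stem with a seven-base 3′ tail made up of a T followed by the six degenerate H positions.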

Modified nucleotide triphosphates used in the experiment are listed in Table 10.

TABLE 10 Nucleotide triphosphates

Nucleotide triphosphate (structure drawings missing or illegible when filed)
N⁶-(6-Azido)hexyl-2′-dATP (2′dATP-N3, Jena Biosciences P/N NU-1707S)
2′-Deoxyadenosine-5′-(α-thio)-triphosphate (dATPαS, Jena Biosciences NU-426S)

Enzymes used in the experiment are listed in Table 11 below.

TABLE 11 Enzymes

Enzyme                                         Vendor and P/N
Terminal deoxynucleotidyl transferase (TdT)    Enzymatics P7070L
T4 Polynucleotide Kinase (PNK)                 Enzymatics Y9040L
Klenow(exo-) DNA polymerase                    Enzymatics P7010-HC-L
T4 DNA Ligase                                  Enzymatics L6030-HC-L
Thermolabile UDG                               Enzymatics G5020L
VeraSeq Ultra DNA polymerase                   Enzymatics P7520L
Exonuclease I (ExoI)                           Enzymatics X8010L
Exonuclease VII (ExoVII)                       NEB M0379S
Exonuclease T (ExoT)                           NEB M0265S
Endonuclease VII (EndoVII)                     Enzymatics Y9080L

Method

Step 1. Formation of Hairpin-Triphosphate (OmniPin) Adapters

To 0.5 μL of Tris-HCl (100 mM, pH 7.0) were added 1 nmol of 2′dATP (2 μL of 500 μM in 10 mM Tris-HCl, pH 7.0) and 1.25 nmol of the CEG_OPS_6H_Biotin adapter (2.5 μL of 500 μM in DMSO). Each mixture was incubated at 10° C. for 2 hr and diluted down to a final concentration of 100 μM by the addition of 5 μL of Tris-HCl (100 mM, pH 7.0).
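The quantities in Step 1 can be checked with simple arithmetic. The Python sketch below assumes that the stated 100 μM final concentration refers to the nucleotide triphosphate, which is the limiting reagent; on that reading it is the upper limit for the hairpin-triphosphate adapter if the coupling goes to completion. The function names are illustrative.

```python
def nmol(volume_ul, conc_um):
    """Amount in nmol contained in volume_ul microlitres at conc_um micromolar."""
    return volume_ul * conc_um / 1000.0


def micromolar(amount_nmol, volume_ul):
    """Concentration in micromolar of amount_nmol dissolved in volume_ul microlitres."""
    return amount_nmol * 1000.0 / volume_ul


buffer_ul = 0.5                        # Tris-HCl, 100 mM, pH 7.0
datp_ul, datp_um = 2.0, 500.0          # nucleotide triphosphate stock
adapter_ul, adapter_um = 2.5, 500.0    # hairpin adapter stock in DMSO
diluent_ul = 5.0                       # Tris-HCl added after the incubation

datp_nmol = nmol(datp_ul, datp_um)             # 1.0 nmol
adapter_nmol = nmol(adapter_ul, adapter_um)    # 1.25 nmol
reaction_ul = buffer_ul + datp_ul + adapter_ul # 5.0 uL reaction volume
final_ul = reaction_ul + diluent_ul            # 10.0 uL after dilution

print(datp_nmol, adapter_nmol, reaction_ul, final_ul)
print(micromolar(datp_nmol, final_ul))         # nucleotide at 100 uM
print(micromolar(adapter_nmol, final_ul))      # total adapter at 125 uM
```

The adapter is therefore in 1.25-fold molar excess over the nucleotide, and the 10 μL diluted mixture contains the nucleotide at 100 μM.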

Step 2. Bisulfite Conversion of Human Genomic DNA

Human genomic DNA (Promega, 1 μg) was bisulfite (BS) converted using the TrueMethyl conversion kit (CEGX) following the manufacturer's specification. The DNA was then quantified by Qubit ssDNA assay kit (Invitrogen).

Step 3. PNK Treatment of Genomic DNA

To 60 ng of BS converted DNA in 1× TdT buffer (100 mM Tris-acetate, 1.25 mM CoAc₂, 125 mg/mL BSA, pH 6.6 @ 25° C.) 10 U of PNK was added and the mixture was incubated at 37° C. for 20 min. The PNK reaction was stopped by denaturing at 95° C. for 3 min.

Step 4. Addition of the Hairpin-Triphosphate Adapter to the PNK Treated Genomic DNA

To the PNK treated DNA after heat denaturation, 50 pmol of the OmniPin adapter (hairpin-triphosphate adapter) and 20 U of TdT were added and the mixture was incubated at 37° C. for 30 min.

Step 5. Binding of OmniPin-Adapted DNA Fragments to Streptavidin Coated Magnetic Beads

The OmniPin-adapted DNA fragments were bound to 50 μL of streptavidin coated magnetic particles (Life Technologies Dynabeads, M280) in 1× BW buffer (1 M NaCl, 5 mM Tris-HCl, 0.5 mM EDTA, 0.1% Tween, pH 8.0 at 25° C.) for 30 mins at 25° C.

Step 6. Washing of Immobilised OmniPin-Adapted DNA Fragments

DNA-bound streptavidin coated magnetic particles were precipitated on a magnetic rack and the supernatant was removed and discarded. The beads were washed twice by re-suspending in a high stringency wash buffer (0.1×SSC, 0.1×SDS). The beads were finally washed with 1× Ligation buffer (50 mM Tris-HCl, 10 mM MgCl₂, 5 mM DTT, 1 mM ATP, pH 7.6 @ 25° C.).

Step 7. On-Bead Klenow Extension of OmniPin-Adapted DNA Fragments

The washed DNA-bound streptavidin beads were re-suspended in 1× ligation buffer (50 mM Tris-HCl, 10 mM MgCl₂, 5 mM DTT, 1 mM ATP, pH 7.6 @ 25° C.) supplemented with 0.25 mM dNTP (dATPαS was used in place of dATP), 10 U of PNK and 50 U of Klenow(exo-). The reaction was incubated at 37° C. for 30 min.

Step 8. Washing of Klenow Extended Immobilised OmniPin-Adapted DNA Fragments

Extended DNA-bound streptavidin coated magnetic particles were precipitated on a magnetic rack and the supernatant was removed. The beads were washed twice by re-suspending in a high stringency wash buffer (0.1×SSC, 0.1×SDS). The beads were finally washed with 1× NEBuffer 4 (50 mM Potassium Acetate, 20 mM Tris-acetate, 10 mM Magnesium Acetate, 1 mM DTT, pH 7.9 at 25° C.).

Step 9. On-Bead Nuclease Digestion of Immobilized DNA Fragments

A selection of nucleases was independently tested at this stage. The washed, Klenow extended, DNA-bound streptavidin beads were resuspended in 1× NEBuffer 4 supplemented with either 10 U of Exo I, or 10 U of Exo VII, or 10 U of Exo T, or 20 U of a combination of Exo I, Exo VII and Exo T. The reactions were incubated at 37° C. for 60 min.

Step 10. Washing of Nuclease Digested Immobilised OmniPin-Adapted DNA Fragments

Nuclease digested DNA-bound streptavidin coated magnetic particles were precipitated on a magnetic rack and the supernatant was removed. The beads were washed twice by re-suspending in a high stringency wash buffer (0.1×SSC, 0.1×SDS). The beads were finally washed with 1× Ligation buffer (50 mM Tris-HCl, 10 mM MgCl₂, 5 mM DTT, 1 mM ATP, pH 7.6 @ 25° C.).

Step 11. On-Bead Ligation of the Second Adapter

The washed nuclease-digested DNA-immobilized streptavidin beads were re-suspended in 1× Ligation buffer supplemented with 0.1 pmol of pre-annealed DNA adapters (equimolar mix of CEG_SHORT_IDX_AD_3P and CEG_SHORT_IDX_COMP_53P) and 600 U of T4 DNA Ligase. The mixture was incubated for 15 mins at 25° C. to yield the doubly-adapted DNA fragment product.
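The two strands of the pre-annealed second adapter can be checked for complementarity from the sequences in Table 9. The Python sketch below is illustrative only: it treats U as T for base pairing, omits the terminal phosphates, and uses hypothetical helper names.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Base sequences from Table 9, terminal phosphates omitted.
AD_3P = "GTGACTGGAGTUCAGACGTGTGCTCTUCCGATCT"
COMP_53P = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


ad_as_dna = AD_3P.replace("U", "T")   # assumption: U pairs like T
paired_region = revcomp(COMP_53P)     # stretch of AD_3P that COMP_53P can pair with
offset = ad_as_dna.find(paired_region)
overhang = ad_as_dna[offset + len(paired_region):]

print(len(paired_region), offset, repr(overhang))
```

With these sequences the script prints `33 0 'T'`: CEG_SHORT_IDX_COMP_53P pairs with the first 33 bases of CEG_SHORT_IDX_AD_3P, leaving the 3′-terminal T of CEG_SHORT_IDX_AD_3P unpaired in the annealed duplex.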

Step 12. Washing of Doubly-Adapted Immobilized DNA Fragments

The doubly-adapted DNA-immobilized streptavidin coated magnetic particles were precipitated on a magnetic rack and the supernatant was removed. The beads were washed twice by re-suspending in a high stringency wash buffer (0.1×SSC, 0.1×SDS). The beads were finally washed with 1× VeraSeq Buffer (25 mM TAPS, 50 mM KCl, 2 mM MgCl₂, 1 mM ß-ME, pH 9.3 at 25° C.).

Step 13. On-Bead UDG/Endonuclease VII Digestion of Immobilized Doubly Adapted DNA Fragments

The washed doubly-adapted DNA-immobilized streptavidin beads were re-suspended in 1× VeraSeq buffer and treated with 1 U of thermolabile UDG and 10 U of Endonuclease VII for 20 mins at 37° C. before the UDG reaction was stopped by denaturing at 60° C. for 10 min. This treatment cuts the desired product from the bead. The streptavidin coated magnetic particles were precipitated on a magnetic rack and the supernatant was removed and retained for further PCR amplification. This final library of single stranded, doubly adapted fragments is referred to as an OmniPrep library.

Step 14. PCR Amplification of OmniPrep Libraries

PCR amplification of the OmniPrep libraries was performed on an Agilent Surecycler 8800 thermocycler in 1× VeraSeq buffer supplemented with 125 μM of the forward PCR primer (Fwd_PCR_Primer_long), 125 μM of the reverse PCR primer (Rev_PCR_Primer_long), 500 μM dNTPs and 1 U of VeraSeq 2.0 DNA polymerase. Thermocycling conditions were 10 cycles of:

Denaturation at 95° C. for 30 sec
Annealing at 60° C. for 30 sec
Extension at 72° C. for 90 sec

Results

The PCR products were loaded on to a 2% agarose gel and run at 120 V for 60 mins (gel shown in FIG. 13). The results show that both an on-bead based variant of OmniPrep and the use of dATPαS instead of dATP are possible and add performance benefits to the ssDNA library construction method. Furthermore, nuclease treatment prior to ligation can be used to reduce potential contaminants and unwanted side-products (for example, adapter-dimer) while maintaining the integrity of the sample-prepped library.

1.-24. (canceled)
25. A method comprising: contacting a single stranded oligonucleotide with an enzyme, wherein said enzyme catalyzes an addition of a nucleotide to an end of said single stranded oligonucleotide in a template independent manner, wherein said nucleotide comprises a 5′-triphosphate.
26. The method of claim 25, wherein said enzyme comprises a terminal transferase (TdT).
27. The method of claim 26, wherein said terminal transferase comprises a terminal deoxynucleotidyl transferase (TdT), a polyadenylate polymerase (PAP) or a poly(U) polymerase (PUP).
28. The method of claim 25, wherein said nucleotide comprises a plurality of nucleotides.
29. The method of claim 25, wherein said single stranded oligonucleotide is produced by at least one of (i) a bisulfite treatment; (ii) a chemical or enzymatic cleavage; or (iii) use of an enzyme that is a restriction endonuclease to form said single stranded oligonucleotide.
30. The method of claim 25, wherein a second oligonucleotide strand comprises said nucleotide.
31. The method of claim 30, further comprising associating at least a portion of said single stranded oligonucleotide with at least a portion of said second oligonucleotide strand to form a double-stranded oligonucleotide.
32. The method of claim 31, wherein said second oligonucleotide strand comprises a hairpin.
33. The method of claim 32, wherein said hairpin comprises an extendable 3′-end.
34. The method of claim 31, wherein said second oligonucleotide strand is associated with a solid support.
35. The method of claim 34, wherein said second oligonucleotide strand comprises a moiety for attachment to said solid support.
36. The method of claim 31, wherein said associating comprises ligation between said single stranded oligonucleotide and said second oligonucleotide strand.
37. The method of claim 25, wherein said single stranded oligonucleotide comprises a plurality of single stranded nucleotides.
38. The method of claim 37, wherein said plurality of single stranded nucleotides comprises a whole genome library or a loci specific library.
39. The method of claim 37, wherein said plurality of single stranded nucleotides are different.
40. The method of claim 25, wherein said single stranded oligonucleotide comprises DNA.
41. The method of claim 25, wherein said single stranded oligonucleotide comprises a modified base.
42. The method of claim 25, wherein said enzyme is a non-naturally occurring enzyme.